Abstract

Independent component analysis (ICA) is a method for extracting useful information from data:
it finds the underlying factors, or components, of multidimensional statistical data. ICA is
distinguished from other methods in that it looks for components that are both statistically
independent and non-gaussian. Because ICA algorithms are computationally complex and operate on
large data sets, techniques are needed that provide faster, and potentially real-time,
implementations of ICA for signal and image processing applications. Very large scale
integration (VLSI) technology offers modularity, hierarchy and parallelism, and can satisfy
these requirements. Reconfigurable modules play a major role because they are highly reusable
and can readily be retargeted to other ICA-related applications. However, these solutions also
have limitations and pose critical challenges. This paper reviews the basic concepts of ICA,
the existing ICA methods, and the merits and demerits of their VLSI implementation. Although
ICA has been reviewed in several articles, this paper comprehensively discusses ICA in VLSI and
the major challenges in ICA implementation.

Index Terms - FPGA, Review of ICA, Statistical signal processing, VLSI.



IEEE Member, Research Scholar, Anna University, Tamil Nadu, India

IEEE Member, Principal, Tejaa Sakthi Institute of Technology for Women, Tamil Nadu, India

A Monthly Double-Blind Peer Reviewed Refereed Open Access International e-Journal - Included in the International Serial Directories 
Indexed & Listed at: Ulrich's Periodicals Directory ©, U.S.A., as well as in Cabell's Directories of Publishing Opportunities, U.S.A.

International Journal of Management, IT and Engineering 
http://www.ijmra.us 






August 2013, Volume 3, Issue 8, ISSN: 2249-0558



I. INTRODUCTION TO ICA 

ICA is related to conventional methods for analyzing large data sets, such as principal
component analysis (PCA) and factor analysis (FA). Whereas ICA finds a set of independent
source signals, PCA and FA find a set of signals with a much weaker property than
independence. PCA would extract a set of uncorrelated signals from a set of mixtures. If these
mixtures were microphone outputs, the extracted signals would simply be a new set of voice
mixtures. In contrast, ICA would extract a set of independent signals from this set of mixtures,
so that the extracted signals would be a set of single voices.

ICA belongs to a class of blind source separation (BSS) methods for separating data into
underlying informational components. Such data can take the form of images, sounds,
telecommunication channels or stock market prices. The term "blind" implies that such methods
can separate data into source signals even if very little is known about the nature of those
source signals.

Suppose two people are speaking at the same time in a room containing two microphones. If each
voice signal is examined at a fine time scale, it becomes apparent that the amplitude of one
voice at any given point in time is unrelated to the amplitude of the other voice at that time.
The amplitudes are unrelated because the two voices are generated by two unrelated physical
processes. If we know that the voices are unrelated, then one key strategy for separating voice
mixtures into their constituent voice components is to look for unrelated time-varying signals
within these mixtures. Using this strategy, the extracted signals are unrelated, just as the
voices are unrelated. This is the basic principle behind ICA. It can be used to separate not
only mixtures of sounds but mixtures of almost any type, such as electroencephalographic (EEG)
signals, faces and fMRI data. The defining feature of the extracted signals is that each
extracted signal is statistically independent of all the other extracted signals. This leads us
to the following definition of ICA.

Consider a set of observations of random variables $(x_1(t), x_2(t), \ldots, x_n(t))$, where $t$
is the time or sample index, and assume that they are generated as a linear mixture of
independent components $s_i(t)$ through an unknown mixing matrix $A$. Independent component
analysis then consists of estimating both the matrix $A$ and the $s_i(t)$ when only the
$x_i(t)$ are observed, as in (1).

\[
\begin{pmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{pmatrix}
= A \begin{pmatrix} s_1(t) \\ s_2(t) \\ \vdots \\ s_n(t) \end{pmatrix}
\tag{1}
\]



It is assumed here that the number of independent components $s_i(t)$ is equal to the number of
observed variables; this is a simplifying assumption that is not strictly necessary.
Alternatively, ICA can be defined as finding a linear transformation given by a matrix $W$, as
in (2), such that the random variables $y_i(t)$ are as independent as possible.



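\[
\begin{pmatrix} y_1(t) \\ y_2(t) \\ \vdots \\ y_n(t) \end{pmatrix}
= W \begin{pmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{pmatrix}
\tag{2}
\]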



This formulation is not really different from the previous one, since after estimating $A$, its
inverse gives $W$. The model in (1) can be estimated if and only if the components $s_i$ are
nongaussian (more precisely, at most one of them may be gaussian). This fundamental requirement
also explains the main difference between ICA and factor analysis, in which the nongaussianity
of the data is not taken into account. Since factor analysis also models the data as linear
mixtures of some underlying factors, ICA can be considered nongaussian factor analysis.
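As an illustration of the model in (1) and (2), the following minimal Python sketch mixes two made-up source sequences with an assumed 2x2 matrix A and recovers them with W equal to the inverse of A. The source values, the entries of A and the helper names `mix` and `inverse_2x2` are assumptions chosen only for illustration; in a real ICA problem A is unknown and W has to be estimated from the mixtures alone.

```python
# Two hypothetical source signals (illustrative values, not taken from any data set).
s1 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # square-wave-like source
s2 = [0.5, 1.0, -1.5, 0.5, 1.0, -1.5]    # unrelated periodic source

# Assumed 2x2 mixing matrix A (unknown in a real blind separation problem).
A = [[1.0, 0.5],
     [0.3, 1.0]]


def mix(M, u1, u2):
    """Apply the 2x2 matrix M to every sample of the signal pair (u1, u2)."""
    v1 = [M[0][0] * a + M[0][1] * b for a, b in zip(u1, u2)]
    v2 = [M[1][0] * a + M[1][1] * b for a, b in zip(u1, u2)]
    return v1, v2


def inverse_2x2(M):
    """Closed-form inverse of a non-singular 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


x1, x2 = mix(A, s1, s2)   # observed mixtures, x = A s as in (1)
W = inverse_2x2(A)        # ideal separating matrix, W = A^-1
y1, y2 = mix(W, x1, x2)   # recovered components, y = W x as in (2)
```

With the true inverse, y1 and y2 reproduce s1 and s2 up to rounding error. An ICA algorithm can recover the sources only up to permutation and scaling, because it never sees A.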

II. ICA PREPROCESSING

One important fact about standard BSS methods such as ICA is that there must be at least as
many different mixtures of a set of source signals as there are source signals. If the number
of source signals is known to be less than the number of signal mixtures, the number of signals
extracted by ICA can be reduced either by preprocessing the signal mixtures using principal
component analysis or by specifying the exact number of source signals to be extracted. It is
highly recommended to preprocess the data before applying the ICA algorithm in order to
simplify the estimation of the mixing matrix. Given a set of multivariate measurements, the
purpose of preprocessing is to find a smaller set of variables with less redundancy that gives
as good a representation as possible. The preprocessing of the mixed signal involves centering
and whitening.

A. Centering

Centering converts the mixture input $X$ to a zero-mean signal by subtracting the mean from the
incoming signal, as in Fig. 1.

Fig. 1. Implementation of centering



B. Whitening 

Whitening transforms this zero-mean $X$ into a new vector $Z$ whose components are
uncorrelated and have unit variance. This is carried out by eigenvalue decomposition (EVD) of
the covariance matrix $C_X$ of $X$.
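The two preprocessing steps can be sketched for the two-mixture case as follows. This is a minimal illustration, assuming the 1/N covariance estimate, a closed-form EVD $C_X = E D E^T$ of the 2x2 covariance matrix, and the whitening matrix $V = D^{-1/2} E^T$; the function names and sample values are hypothetical, and the sketch does not represent the hardware datapath of Fig. 1.

```python
import math


def center(x):
    """Subtract the sample mean so that the signal has zero mean."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def whiten_2ch(x1, x2):
    """Center two mixtures and whiten them by EVD of their 2x2 covariance.

    Assumes the mixtures are not perfectly collinear, so that both
    eigenvalues of the covariance matrix are strictly positive.
    """
    x1, x2 = center(x1), center(x2)
    n = len(x1)
    # Entries of the symmetric covariance matrix [[a, b], [b, d]].
    a = sum(u * u for u in x1) / n
    b = sum(u * v for u, v in zip(x1, x2)) / n
    d = sum(v * v for v in x2) / n
    # Closed-form EVD: eigenvectors (c, s) and (-s, c) with rotation angle phi.
    phi = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(phi), math.sin(phi)
    lam1 = a * c * c + 2.0 * b * c * s + d * s * s
    lam2 = a * s * s - 2.0 * b * c * s + d * c * c
    # Whitening matrix V = D^(-1/2) E^T (rows are scaled eigenvectors).
    V = [[c / math.sqrt(lam1), s / math.sqrt(lam1)],
         [-s / math.sqrt(lam2), c / math.sqrt(lam2)]]
    z1 = [V[0][0] * u + V[0][1] * v for u, v in zip(x1, x2)]
    z2 = [V[1][0] * u + V[1][1] * v for u, v in zip(x1, x2)]
    return z1, z2, V


# Illustrative input samples (made-up values).
x1 = [1.0, 2.0, 4.0, 3.0, 0.0, 5.0]
x2 = [2.0, 1.0, 3.0, 5.0, 1.0, 4.0]
z1, z2, V = whiten_2ch(x1, x2)
```

After this step the sample covariance of (z1, z2) is the identity matrix up to rounding. For more than two mixtures, a general symmetric eigensolver would replace the closed-form rotation.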

III. ICA IN LITERATURE

A. Algebraic ICA Algorithm 

The algorithm performs ICA by solving simultaneous equations derived from the definition of
independence. An algebraic solution to ICA is proposed by Taro Yamaguchi et al. in [17].
The absence of iterative calculation reduces the processing time while the accuracy is
maintained. The algorithm works very fast for the separation of two sources; its disadvantage
is that it becomes extremely complex when the number of sources exceeds two.

B. Evolutionary ICA Algorithm 

Evolutionary computation techniques are popular population-based search and optimization
methods. With evolutionary mechanisms such as genetic algorithms (GA) and swarm intelligence,
an optimal separating matrix that minimizes the dependence can be obtained. Unlike
gradient-based methods, population-based search methods such as GA and PSO are designed to
search for a global optimum rather than a local one. GA has been used for nonlinear blind
source separation in [20] and for noise separation from electrocardiogram signals in [21].
Particle swarm optimization (PSO) is used for ICA in [4]. Currently, several evolutionary
optimization algorithms are used in ICA. The main disadvantage of ICA techniques based on
evolutionary computation is their heavy computational complexity. With the advent of highly
parallel processors and new technologies, however, these methods provide competitive solutions.

C. Infomax Estimation or Maximum Likelihood Algorithm 

Maximum likelihood (ML) estimation is based on the assumption that the unknown parameters
to be estimated are constants, or that no prior information about them is available. Owing to
its asymptotic optimality properties, ML (infomax) estimation is a desirable choice when the
number of samples is large. The parameter values that give the highest probability for the
observations are taken as the estimates. The simplest algorithm for maximizing the likelihood
(or, equivalently, the log-likelihood) was given by Bell and Sejnowski [7] using stochastic
gradient methods. The resulting algorithm for ML estimation is given in (4):

\[
\Delta W \propto [W^T]^{-1} + E\{g(Wx)\,x^T\}
\tag{4}
\]

In the case of basic ICA, the natural gradient and relative gradient principles both amount to
multiplying the right-hand side of (4) by $W^T W$. This gives (5):

\[
\Delta W \propto \left(I + E\{g(y)\,y^T\}\right) W
\tag{5}
\]

where $y = Wx$. After this modification the algorithm needs no sphering. It can be interpreted
as a special case of the nonlinear decorrelation algorithm described in subsection E.

A Newton method for maximizing the likelihood has been introduced in [8]. The infomax principle
is very closely related to the maximum likelihood estimation principle for ICA. It is based on
maximizing the output entropy, or information flow, of a neural network with nonlinear outputs;
hence the name infomax.
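One batch iteration of the update in (5) may be sketched as below. This is only an illustrative sketch: the function name `natural_gradient_step`, the learning rate `lr` and the default nonlinearity $g(u) = -2\tanh(u)$ (a common choice for supergaussian sources) are assumptions, and the expectation is replaced by a sample average.

```python
import math


def natural_gradient_step(W, X, lr=0.1, g=lambda u: -2.0 * math.tanh(u)):
    """Return W + lr * (I + E{g(y) y^T}) W, with y = W x, for a batch X.

    W is an n x n list of lists; X is a list of samples, each of length n.
    """
    n = len(W)
    # y = W x for every sample in the batch.
    Y = [[sum(W[i][k] * x[k] for k in range(n)) for i in range(n)] for x in X]
    # Sample estimate of E{g(y) y^T}.
    C = [[sum(g(y[i]) * y[j] for y in Y) / len(Y) for j in range(n)]
         for i in range(n)]
    # G = I + E{g(y) y^T}.
    G = [[(1.0 if i == j else 0.0) + C[i][j] for j in range(n)]
         for i in range(n)]
    # New separating matrix W + lr * G W (no matrix inversion is required).
    return [[W[i][j] + lr * sum(G[i][k] * W[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]


# Illustrative call with an identity starting matrix and made-up samples.
W0 = [[1.0, 0.0], [0.0, 1.0]]
batch = [[0.2, -0.1], [-0.4, 0.3], [0.1, 0.5]]
W1 = natural_gradient_step(W0, batch)
```

Compared with (4), the step multiplies by W instead of inverting its transpose, which is one reason this form is attractive for hardware realization.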

D. Non-linear Cross-Correlation Based Algorithm

The principle of cancellation of non-linear cross-correlations is used to estimate independent
components in [1]. Non-linear cross-correlations are of the form $E\{g_1(y_i)\,g_2(y_j)\}$,
where $g_1$ and $g_2$ are suitably chosen nonlinearities. If $y_i$ and $y_j$ are independent
and have symmetric densities, these cross-correlations are zero. Jutten and Herault used this
principle in [2] to update the off-diagonal terms of the matrix $W$, as given in (6):

\[
\Delta W_{ij} \propto g_1(y_i)\,g_2(y_j), \quad i \neq j
\tag{6}
\]

The $y_i$ are computed at every iteration, and after convergence they give the estimates of the
independent components. However, the algorithm converges only under severe restrictions [3].

E. Nonlinear Decorrelation Algorithm

To reduce the computational overhead by avoiding the matrix inversions in [2], and to improve
stability, the following algorithm has been proposed in [5] for updating the weight matrix, as
in (7):

\[
\Delta W \propto \left(I - g_1(y)\,g_2(y^T)\right) W
\tag{7}
\]

where the nonlinearities $g_1(\cdot)$ and $g_2(\cdot)$ are applied separately to every
component of the vector $y$, and the identity matrix can be replaced by any positive definite
diagonal matrix. The EASI algorithm has been proposed in [6]. According to EASI, the weight
matrix is updated as in (8):

\[
\Delta W \propto \left(I - y y^T - g(y)\,y^T + y\,g(y^T)\right) W
\tag{8}
\]

The choice of the nonlinearities used in the above rules is generally provided by the maximum
likelihood or infomax approach.
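A single-sample (stochastic) version of the EASI update in (8) could look like the sketch below. The function name `easi_step`, the step size `mu` and the choice $g = \tanh$ are assumptions made for illustration; they are not prescribed by [6].

```python
import math


def easi_step(W, x, mu=0.01, g=math.tanh):
    """Return W + mu * (I - y y^T - g(y) y^T + y g(y)^T) W for one sample x."""
    n = len(W)
    y = [sum(W[i][k] * x[k] for k in range(n)) for i in range(n)]
    gy = [g(v) for v in y]
    # H = I - y y^T - g(y) y^T + y g(y)^T, built entry by entry.
    H = [[(1.0 if i == j else 0.0) - y[i] * y[j] - gy[i] * y[j] + y[i] * gy[j]
          for j in range(n)]
         for i in range(n)]
    return [[W[i][j] + mu * sum(H[i][k] * W[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]


# Illustrative call: identity start, one made-up observation.
W_next = easi_step([[1.0, 0.0], [0.0, 1.0]], [0.3, -0.7])
```

The term $I - y y^T$ drives the outputs towards unit variance and decorrelation, while the skew-symmetric part $y\,g(y)^T - g(y)\,y^T$ removes the remaining higher-order dependence, so no separate whitening stage is needed.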

F. Nonlinear PCA Algorithm 

Another approach to ICA that is related to PCA is the nonlinear PCA method. A nonlinear
representation is sought for the input data that minimizes a least mean-square error
criterion. In the linear case, principal components are obtained, and in some cases the
nonlinear PCA approach gives independent components. In [9], the following version of a
hierarchical PCA learning rule is introduced, given in (9):

\[
\Delta w_i \propto g(y_i)\,x - g(y_i) \sum_{j=1}^{i} g(y_j)\,w_j
\tag{9}
\]

where $g$ is a suitable nonlinear scalar function. Algorithms for exactly maximizing the
nonlinear PCA criteria are introduced in [10].

G. Gradient Descent One-Unit Neural Learning Rules

Using the principle of stochastic gradient descent, simple algorithms can be derived from the
one-unit contrast functions. For whitened data, a Hebbian-like learning rule [11], [12] is
obtained by taking the instantaneous gradient of the contrast function with respect to $w$, as
in (10):

\[
\Delta w \propto \left[E\{G(w^T x)\} - E\{G(\nu)\}\right] x\,g(w^T x)
\tag{10}
\]

where $g$ is the derivative of $G$ and $\nu$ is a standardized gaussian variable. Such one-unit
algorithms based on kurtosis are proposed in [13]. To estimate several independent components,
a system of several units is needed.

H. Tensor Based ICA Algorithm

Independent components can also be estimated using higher-order cumulant tensors.
Tensors are generalizations of matrices, or linear operators, and cumulant tensors are
generalizations of the covariance matrix $C_X$. The covariance matrix is the second-order
cumulant tensor, and the fourth-order tensor is defined by the fourth-order cumulants.
Eigenvalue decomposition (EVD) is used to whiten the data.

Joint approximate diagonalization of eigenmatrices (JADE), proposed by Cardoso [14], is based
on the principle of computing several cumulant tensors. Owing to the computational complexity
of explicit tensor EVD, JADE is restricted to small dimensions, and it is inferior to methods
using likelihood or non-polynomial cumulants [15]. With low-dimensional data, however, JADE is
a competitive alternative to the more popular FastICA algorithm.

A similar and simpler approach that uses the EVD is the fourth-order blind identification
(FOBI) method [16], which performs the EVD of a weighted correlation matrix. It is of
reasonable complexity and is probably the most efficient of all the ICA methods. However, it
fails to separate the sources when they have identical kurtosis. Other approaches include
maximization of squared cumulants [18] and fourth-order cumulant based methods [19].
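The FOBI idea can be sketched for two channels as follows, assuming the inputs are already whitened and taking the weighted correlation matrix to be $E\{\|z\|^2 z z^T\}$. The function name `fobi_2ch` and the synthetic test signals are hypothetical; the demonstration uses two made-up sources with different kurtosis, rotated by a known angle, so that the result can be compared with the true rotation.

```python
import math


def fobi_2ch(z1, z2):
    """FOBI for two whitened channels: EVD of M = E{|z|^2 z z^T}.

    Returns the 2x2 rotation W whose rows are the eigenvectors of M, so
    that y = W z estimates the sources (up to order and sign).
    """
    n = len(z1)
    m11 = m12 = m22 = 0.0
    for u, v in zip(z1, z2):
        r2 = u * u + v * v            # squared norm used as the weight
        m11 += r2 * u * u
        m12 += r2 * u * v
        m22 += r2 * v * v
    m11, m12, m22 = m11 / n, m12 / n, m22 / n
    # Closed-form EVD of the symmetric 2x2 matrix [[m11, m12], [m12, m22]].
    phi = 0.5 * math.atan2(2.0 * m12, m11 - m22)
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s], [-s, c]]


# Synthetic whitened data: a binary source and a three-level source, both
# with zero mean and unit variance, taken on their full product grid.
amp = math.sqrt(1.5)
sources = [(u, v) for u in (-1.0, 1.0) for v in (-amp, 0.0, amp)]
theta = 0.5                                        # assumed mixing rotation
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
z1 = [R[0][0] * u + R[0][1] * v for u, v in sources]
z2 = [R[1][0] * u + R[1][1] * v for u, v in sources]

W = fobi_2ch(z1, z2)
# P = W R should be a signed permutation matrix if separation succeeded.
P = [[sum(W[i][k] * R[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
```

If both sources had the same kurtosis, the matrix M would be a multiple of the identity, every rotation would diagonalize it, and the eigenvectors would carry no information about the mixing. This is the failure case mentioned above.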

I. FastICA Algorithm

One of the most popular solutions to the BSS problem is FastICA [8], owing to its simplicity
and fast convergence. The FastICA learning rule finds a vector $w$ such that the projection
$w^T x$ maximizes a contrast function. Nongaussianity is measured by an approximation of
negentropy, which serves as the contrast function. The basic algorithm involves preprocessing
and a fixed-point iteration scheme for one unit. The independent components (ICs) can be
estimated one by one using a deflationary approach, or they can be estimated simultaneously.
In the deflationary approach, it must be ensured that the rows $w_j$ of the separating matrix
$W$ are orthogonal. This can be done after every iteration step by subtracting from the current
estimate $w_p$ its projections onto all previously estimated vectors $w_j$, $j < p$, and then
renormalizing $w_p$. With a kurtosis-based contrast function, FastICA can be shown to converge
globally to the ICs [12].
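The deflationary scheme described above can be sketched with the kurtosis-based fixed-point step $w \leftarrow E\{z\,(w^T z)^3\} - 3w$ followed by Gram-Schmidt deflation and normalization. This is a minimal sketch that assumes the input samples are already centered and whitened; the function names, the unit-vector starting points and the synthetic data are illustrative assumptions and do not reproduce any of the cited implementations.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(w, found):
    """Gram-Schmidt deflation: remove projections on earlier vectors, normalize."""
    for wj in found:
        proj = dot(w, wj)
        w = [wi - proj * wji for wi, wji in zip(w, wj)]
    norm = math.sqrt(dot(w, w))
    if norm < 1e-12:
        raise ValueError("degenerate vector; choose a different starting point")
    return [wi / norm for wi in w]


def fastica_kurtosis(Z, n_components, max_iter=100, tol=1e-12):
    """Deflationary FastICA with the kurtosis contrast on whitened samples Z."""
    dim = len(Z[0])
    found = []
    for p in range(n_components):
        # Deterministic start: the p-th unit vector, deflated and normalized.
        w = orthonormalize([1.0 if i == p else 0.0 for i in range(dim)], found)
        for _ in range(max_iter):
            y = [dot(w, z) for z in Z]
            # Fixed-point step: w+ = E{z (w^T z)^3} - 3 w.
            w_new = [sum(z[i] * yk ** 3 for z, yk in zip(Z, y)) / len(Z) - 3.0 * w[i]
                     for i in range(dim)]
            w_new = orthonormalize(w_new, found)
            # Convergence up to sign: old and new directions are parallel.
            done = abs(abs(dot(w_new, w)) - 1.0) < tol
            w = w_new
            if done:
                break
        found.append(w)
    return found


# Synthetic whitened mixtures: two independent unit-variance sources on their
# full product grid, rotated by an assumed angle theta.
amp = math.sqrt(1.5)
S = [(u, v) for u in (-1.0, 1.0) for v in (-amp, 0.0, amp)]
theta = 0.5
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
Z = [(R[0][0] * u + R[0][1] * v, R[1][0] * u + R[1][1] * v) for u, v in S]

W = fastica_kurtosis(Z, n_components=2)
# P = W R is close to a signed permutation matrix when both ICs are found.
P = [[sum(W[i][k] * R[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
```

The convergence test compares directions up to sign because, for sources with negative kurtosis, the fixed-point step flips the sign of $w$ at every iteration without changing the extracted component.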

J. Modified Independent Component Analysis

A new automatic method is introduced in [39] to eliminate electrocardiogram (ECG) noise from an
electroencephalogram (EEG) or electrooculogram (EOG). It is based on a modification of the
independent component analysis (ICA) algorithm that gives promising results while using only
a single-channel EEG (or EOG) and the ECG.

K. Recursively Regularized ICA

A new method of frequency-domain blind source separation (FD-BSS), able to separate
acoustic sources under highly reverberant, challenging conditions, is proposed in [29]. In
frequency-domain BSS, the separation is generally performed by applying independent component
analysis (ICA) at each frequency envelope.

L. ICA by Entropy Bound Minimization

A novel ICA algorithm that performs ICA by entropy bound minimization (ICA-EBM) is proposed in
[30]. A novel (differential) entropy estimator is used to approximate the entropy. The
algorithm adopts a line search procedure and initially uses updates that constrain the demixing
matrix to be orthogonal for robust performance. It is able to match sources that come from a
wide range of distributions.

M. Wavelet ICA

Manual artifact rejection is limited by its requirements of manpower and time, so automatic
artifact rejection is needed for effective real-time artifact removal. In [33], a novel
Automatic Wavelet Independent Component Analysis (AWICA) for automatic EEG artifact removal
is proposed. AWICA is based on the joint use of the wavelet transform and ICA. It is a two-step
procedure relying on the concepts of kurtosis and Renyi's entropy. The method is shown to
improve the suppression of artifact components while reducing the loss of residual informative
data.

N. Binary ICA with OR Mixtures

The classical independent component analysis (ICA) framework usually assumes linear
combinations of independent sources over the field of real-valued numbers. Binary ICA for OR
mixtures (bICA) is proposed in [34], together with a deterministic iterative algorithm that
determines the distribution of the latent random variables and the mixing matrix. bICA can find
applications in many domains, including medical diagnosis, multi-cluster assignment, Internet
tomography and network resource management.

O. Discriminant Independent Component Analysis

A conventional linear model based on negentropy maximization may not be optimal for obtaining a
discriminant model with good classification performance. In [35], a single-stage linear
semisupervised extraction, discriminant ICA (dICA), is proposed to project multivariate data
linearly to a lower dimension where the features are maximally discriminant with minimal
redundancy. The optimization problem is formulated as the maximization of a linear summation of
negentropy and a weighted functional measure of classification; the Fisher linear discriminant
is used as this functional measure. Experimental results show improved classification
performance when dICA features are used for recognition tasks, in comparison with unsupervised
and supervised feature extraction techniques.

P. Convex Divergence ICA for Blind Source Separation 

A novel contrast function for evaluating the dependence among sources is presented in [36]. A
convex divergence ICA (C-ICA) is constructed, and a nonparametric C-ICA algorithm is derived
with different convexity parameters, where the non-Gaussianity of the source signals is
characterized by a Parzen window-based distribution. This specialized C-ICA significantly
reduces the number of learning epochs needed to estimate the demixing matrix. The convergence
speed is improved by using the scaled natural gradient algorithm. Experiments on instantaneous,
noisy and convolutive mixtures of speech and music signals illustrate the superiority of the
proposed C-ICA over JADE, FastICA, and the nonparametric ICA based on mutual information.

In [31], auditory evoked potential (AEP) recordings are analyzed using independent component
analysis (ICA), and the variation in performance of the different ICA algorithms is observed.
All the algorithms estimate the CI artifact reasonably well, although only one SOS algorithm is
better positioned to estimate the AEP, since it uses the temporal structure of this signal as
part of the ICA process.

A new approach to artifact removal using the S-transform (ST) is proposed in [37]. The ST
provides an instantaneous time-frequency representation of a time-varying signal and generates
high-magnitude S-coefficients at the instants of abrupt changes in the signal.

Time-domain algorithms for blind separation of audio sources can be classified as being based
either on a partial or on a complete decomposition of an observation space. The decomposition
is mostly done under a constraint to reduce the computational burden. However, this constraint
potentially restricts the performance. A novel time-domain algorithm based on an unconstrained
decomposition of the observation space is proposed in [38]. The decomposition is done by an
appropriate independent component analysis (ICA) algorithm, and the independent components are
grouped into clusters corresponding to the original sources. After the responses of the
original sources are estimated, the components of the clusters are combined by a reconstruction
procedure.

A novel method for deflationary ICA, referred to as RobustICA, is put forward in [28]. This
technique performs exact line search optimization of the kurtosis contrast function. The step
size leading to the global maximum of the contrast along the search direction is found among
the roots of a fourth-degree polynomial. This polynomial rooting is performed algebraically, at
low cost, at each iteration. RobustICA deals with real- and complex-valued mixtures of possibly
noncircular sources and avoids prewhitening. The absence of prewhitening improves the
asymptotic performance. The algorithm shows a very high convergence speed in terms of the
computational cost required to reach a given extraction quality.

IV. RECONFIGURABLE SOLUTIONS FOR ICA 

A fixed-point VLSI architecture for 2-dimensional kurtotic FastICA with reduced and optimized
arithmetic units was proposed by Amit Acharyya and Koushik Maharatna [21]. The efficiency is
achieved through the removal of the division operation in eigenvector computation, the
replacement of division operations by multiplications, and a reduction in the number of
multipliers and adders for whitening matrix computation. The numerical error associated with
the finite wordlength representation of fixed-point arithmetic is addressed by introducing
suitable scaling factors (SF) and internal data-bus width variability wherever necessary.
However, the impact of algorithmic parameters such as frame length and convergence threshold
still needs to be investigated, which may lead to further architectural optimization.

A comparative study of the implementation of an ICA algorithm on a fixed-point platform with
respect to a floating-point processor is carried out in [22]. The accuracy and speed were found
to be acceptable. In addition, the fixed-point processor needs less space and consumes less
power, but it can handle only a smaller range of real values. More work needs to be done in
this direction to embed these codes in portable consumer devices without further deterioration
of energy efficiency.
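The limited range of fixed-point arithmetic, and the role of a scaling factor, can be illustrated with a small software model. The sketch below assumes a 16-bit Q15 word format with saturation and round-to-nearest multiplication; the format, the helper names and the scaling factor of 4 are assumptions for illustration only and are not the word lengths or scaling factors used in [21] or [22].

```python
Q = 15                      # number of fractional bits (assumed Q15 format)
Q_MAX = (1 << Q) - 1        # largest raw value, 32767 (just below +1.0)
Q_MIN = -(1 << Q)           # smallest raw value, -32768 (exactly -1.0)


def saturate(raw):
    """Clip a raw integer to the representable 16-bit range."""
    return max(Q_MIN, min(Q_MAX, raw))


def to_q15(value):
    """Quantize a real number to Q15, saturating values outside [-1, 1)."""
    return saturate(int(round(value * (1 << Q))))


def from_q15(raw):
    """Convert a raw Q15 integer back to a real number."""
    return raw / (1 << Q)


def q15_mul(a, b):
    """Multiply two Q15 numbers with rounding, then saturate the result."""
    return saturate((a * b + (1 << (Q - 1))) >> Q)


half = to_q15(0.5)                    # 16384
quarter = q15_mul(half, half)         # 8192, i.e. 0.25

# A value outside [-1, 1) saturates unless it is scaled first.
clipped = from_q15(to_q15(3.0))               # about 0.99997, not 3.0
SCALE = 4.0                                   # hypothetical scaling factor
rescaled = from_q15(to_q15(3.0 / SCALE)) * SCALE   # recovers 3.0
```

Scaling trades range for resolution: dividing by 4 before quantization prevents saturation, but each least significant bit then represents a step four times larger in the original units.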

Owing to its computational complexity and convergence rate, ICA is very time-consuming for
high-volume or high-dimensional data sets such as hyperspectral images. Hardware implementation
provides not only an optimal parallelism environment but also a potentially faster, real-time
solution. Synthesis of a parallel ICA (pICA) algorithm for field programmable gate array (FPGA)
implementation is proposed in [23]. In this method, the pICA is partitioned into three
temporally independent functional modules, each of which is synthesized individually. All
these modules are developed for reuse and retargeting. The modules are then integrated into a
design and development environment for performing FPGA synthesis, optimization, placement and
routing. The pICA algorithm is synthesized for hyperspectral image dimensionality reduction,
and the FPGA executes at a maximum frequency of 20.161 MHz. Performance comparisons between the
proposed and two other ICA-related FPGA implementations showed that the FPGA implementation of
pICA has potential for running complicated algorithms on large-volume data sets.

An FPGA implementation of a digital chip with a modular design concept is reported in [24]. The
ICA algorithm is implemented for simultaneous adaptive noise cancelling (ANC) and BSS
operations for speech enhancement in real time. To provide the enormous computing power
required by ICA-based algorithms, a special digital processor is designed and implemented in
FPGA. The chip design fully utilizes the modular concept, and several chips may be put together
for complex applications with a large number of noise sources. Experiments with ANC only, BSS
only, and simultaneous ANC/BSS demonstrate successful speech enhancement in real time. The chip
is capable of up to 32-channel convolutive BSS/ANC. Experimental results with the FPGA and a
test board for simultaneous two-channel BSS and four-channel ANC demonstrate that the final SNR
is about 16 dB, which is good enough for robust speech-recognition systems.

Gradient flow is a signal conditioning technique for source separation and localization suited
for arrays of very small aperture, i.e., of dimensions significantly smaller than the shortest
wavelength in the sources. A mixed-signal VLSI system that operates on spatial and temporal
differences of the acoustic field at very small aperture to separate and localize mixtures of
traveling wave sources is presented in [25]. Various analog VLSI implementations of ICA exist
in the literature. Since digital adaptation offers the flexibility of reconfigurable ICA
learning rules, digital implementations using DSPs are common practice in the field. The
miniature size of the microphone array enclosure (1 cm diameter) and the micropower consumption
of the VLSI hardware (250 µW) are key advantages of the approach, with applications to hearing
aids, conferencing, multimedia and surveillance.

Among the various available BSS methods, independent component analysis is one of the
representative methods. A practical parallel algorithm and architecture for hardware blind
source separation is investigated, and a feedback network for real-time speech signal
processing is designed, in [26]. Since the network architecture is systolic, it is suitable for
parallel processing. That work covers the process from the systolic design of BSS to the
hardware implementation using Xilinx FPGAs. The simulation results of this implementation are
reported to be satisfactory and robust. The scheme is fast and reliable, since the
architectures are highly regular, and the processing can be done in real time.

FPGAs, being based on reconfiguration technology, provide economical and efficient solutions
for ICA algorithms [27], because end users can modify and configure their designs multiple
times. In particular, the recent rapid increase in the density of FPGAs has made it possible to
implement large ICA designs efficiently. The large number of available standard libraries makes
the design much cheaper and the design process much faster. Digital nonprogrammable ASICs, such
as standard-height library and mask gate arrays, are also used to implement designs at high
circuit density.

V. CONCLUSION

This paper has reviewed the basic concepts of ICA and surveyed existing ICA techniques,
including their VLSI implementation. Since real-time operation of ICA is increasingly
important, reconfigurable ICA modules play a significant role in a wide range of applications.
VLSI implementation has some limitations, and compromises are needed because of design
constraints: when speed is increased, area also increases, which can be partly compensated by
adopting parallelism. Once the critical challenges and issues associated with the VLSI
implementation of ICA algorithms are identified, implementations of complicated ICA algorithms
with high throughput can be provided.